Robust Estimation under Heavy Contamination using Enlarged Models
In data analysis, contamination by outliers is inevitable, and robust
statistical methods are in strong demand. In this paper, we develop a new
approach to robust data analysis based on scoring rules. A scoring rule is a
discrepancy measure that assesses the quality of probabilistic
forecasts. We propose a simple way of estimating not only the parameter in the
statistical model but also the contamination ratio of outliers. Estimating the
contamination ratio is important, since it allows one to detect outliers in the
training samples. For this purpose, we use scoring rules with extended
statistical models, which we call enlarged models. Regression problems are
also considered. We study a
complex heterogeneous contamination, in which the contamination ratio of
outliers in the dependent variable may depend on the independent variable. We
propose a simple method to obtain a robust regression estimator under
heterogeneous contamination. In addition, we show that our method also provides
an estimator of the expected contamination ratio, which can be used to detect
outliers in the training samples. Numerical experiments demonstrate the
effectiveness of our methods compared with conventional estimators.
Comment: 32 pages, 3 figures, 3 tables
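
The idea of estimating the model parameter and the contamination ratio together can be illustrated with a minimal sketch. The abstract does not say which scoring rule or which enlarged model the paper uses, so everything below is an assumption made for illustration: the enlarged model is taken to be c * N(mu, sigma^2) with a free mass c in (0, 1], the scoring rule is taken to be the density-power score with tuning parameter beta, and the function name fit_enlarged_gaussian is hypothetical. Under these assumptions, setting the derivatives of the empirical score to zero gives simple fixed-point updates, and 1 - c serves as the estimated contamination ratio.

```python
import math
import random
import statistics


def fit_enlarged_gaussian(xs, beta=0.5, n_iter=200):
    """Fit c * N(mu, sigma^2) by minimizing the empirical density-power score.

    Illustrative assumption, not the paper's stated procedure: the score is
        -(1/beta) * mean(g(x_i)**beta) + (1/(1+beta)) * integral g**(1+beta),
    with g = c * f and f the Gaussian density. Its stationarity conditions
    give the weighted updates used in the loop below.
    """
    n = len(xs)
    # Robust starting values: median and scaled median absolute deviation.
    mu = statistics.median(xs)
    mad = statistics.median([abs(x - mu) for x in xs])
    sigma2 = (1.4826 * mad) ** 2

    for _ in range(n_iter):
        # Weights proportional to f(x_i)**beta; far-away points get ~0 weight.
        w = [math.exp(-beta * (x - mu) ** 2 / (2.0 * sigma2)) for x in xs]
        sw = sum(w)
        mu = sum(wi * x for wi, x in zip(w, xs)) / sw
        sigma2 = (1.0 + beta) * sum(
            wi * (x - mu) ** 2 for wi, x in zip(w, xs)
        ) / sw

    # Optimal mass: c = mean(f**beta) / integral f**(1+beta).
    # For a Gaussian this simplifies to sqrt(1+beta) * mean(weights).
    w = [math.exp(-beta * (x - mu) ** 2 / (2.0 * sigma2)) for x in xs]
    c = math.sqrt(1.0 + beta) * sum(w) / n
    return mu, math.sqrt(sigma2), c


# Synthetic data (an assumption for the demo): 80% inliers from N(0, 1),
# 20% outliers from N(10, 1).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(800)]
data += [rng.gauss(10.0, 1.0) for _ in range(200)]

mu_hat, sigma_hat, c_hat = fit_enlarged_gaussian(data)
print("mu:", mu_hat, "sigma:", sigma_hat, "contamination ratio:", 1.0 - c_hat)
```

In this sketch, one way to use the estimated ratio for outlier detection would be to flag roughly (1 - c_hat) * n of the samples with the lowest fitted density; the heterogeneous regression setting described in the abstract is not covered here.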
- …